Efficient Algorithms on Sequence Binary Decision Diagrams for Manipulating Sets of Strings
Authors
Abstract
We consider sequence binary decision diagrams (sequence BDDs, or SDDs for short), a compact representation for manipulating sets of strings proposed by Loekito et al. (Knowl. Inf. Syst., 24(2), 235-268, 2009). An SDD resembles an acyclic DFA in binary form, but its reduction rules differ from those for DFAs. In this paper, we study the power of shared and reduced SDDs for storing and manipulating sets of strings. In particular, we first characterize minimal SDDs as reduced SDDs. We then present simple and efficient algorithms for several problems on reduced and shared SDDs: on-the-fly and off-line minimization, dynamic string set construction, and factor SDD construction. Finally, we report experiments on real data sets that show the efficiency and usefulness of SDDs in large-scale string processing.
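To make the terms "reduced" and "shared" concrete, the following is a minimal Python sketch of an SDD store. It is an illustration under stated assumptions, not the algorithms of the paper: the class name `SeqBDD` and the methods `getnode`, `from_string`, `union`, and `contains` are hypothetical names chosen here. It assumes the usual SeqBDD semantics, in which a node (label c, 0-child, 1-child) denotes L(0-child) ∪ c·L(1-child), labels increase along 0-edges, and a node whose 1-child is the 0-terminal is eliminated.

```python
# Illustrative sketch only; all names below are hypothetical, not from the paper.
ZERO, ONE = 0, 1  # terminals: the empty set, and the set containing only ""


class SeqBDD:
    def __init__(self):
        # nodes[i] = (label, zero_child, one_child); slots 0 and 1 are terminals.
        self.nodes = [None, None]
        self.unique = {}  # (label, zero, one) -> node id, enforces sharing
        self.memo = {}    # cache for union results

    def getnode(self, label, zero, one):
        """Return the id of the reduced node for (label, zero, one)."""
        if one == ZERO:  # zero-suppression rule: the node adds no strings
            return zero
        key = (label, zero, one)
        if key not in self.unique:  # sharing rule: one node per distinct triple
            self.nodes.append(key)
            self.unique[key] = len(self.nodes) - 1
        return self.unique[key]

    def from_string(self, s):
        """Build the SDD for the singleton set {s}."""
        node = ONE
        for c in reversed(s):
            node = self.getnode(c, ZERO, node)
        return node

    def union(self, p, q):
        """Return the SDD for L(p) ∪ L(q)."""
        if p == ZERO:
            return q
        if q == ZERO or p == q:
            return p
        if p > q:  # union is commutative, so normalise the cache key
            p, q = q, p
        if (p, q) in self.memo:
            return self.memo[(p, q)]
        if p == ONE:  # q is internal: push the empty string down the 0-chain
            b, q0, q1 = self.nodes[q]
            r = self.getnode(b, self.union(ONE, q0), q1)
        else:
            (a, p0, p1), (b, q0, q1) = self.nodes[p], self.nodes[q]
            if a < b:
                r = self.getnode(a, self.union(p0, q), p1)
            elif a > b:
                r = self.getnode(b, self.union(p, q0), q1)
            else:
                r = self.getnode(a, self.union(p0, q0), self.union(p1, q1))
        self.memo[(p, q)] = r
        return r

    def contains(self, root, s):
        """Test whether string s is in the set represented by root."""
        node = root
        for c in s:
            # Follow 0-edges until a node labelled c is found.
            while node > ONE and self.nodes[node][0] != c:
                node = self.nodes[node][1]
            if node <= ONE:
                return False
            node = self.nodes[node][2]
        # The empty string is present iff the 0-chain ends in the 1-terminal.
        while node > ONE:
            node = self.nodes[node][1]
        return node == ONE
```

Because every node is created through `getnode`, two string sets built in different orders end up with the same root identifier, so equality of sets reduces to comparing two integers. This is the kind of canonical form that a characterization of minimal SDDs as reduced SDDs refers to.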
Similar resources
Studies on Decision Diagrams for Efficient Manipulation of Sets and Strings
In many real-life problems, we are faced with manipulating discrete structures. Manipulation of large discrete structures is one of the most important problems in computer science. For this purpose, a family of data structures called decision diagrams is used. The first of these is the binary decision diagram (BDD), proposed by Bryant in the 1980s. BDD is a data structure to repre...
Building Substring Indices Using Sequence BDDs
There is a demand for efficient indexed-substring data structures, which can store all substrings of a given text. Suffix trees and Directed Acyclic Word Graphs (DAWGs) are examples of substring indices, but they lack operations for manipulating sets of strings. The Sequence Binary Decision Diagram (SeqBDD) data structure is a new type of Binary Decision Diagram (BDD), and ...
Suffix-DDs: Substring Indices Based on Sequence BDDs for Constrained Sequence Mining
In this paper, we study an efficient index structure, called Suffix Decision Diagrams (SuffixDDs), for knowledge discovery in large sequence data. Recently, Loekito, Bailey, and Pei (KAIS, 2009) proposed a new data structure for sequence data, called Sequence Binary Decision Diagram (SeqBDD), which is an extension of Zero-suppressed Binary Decision Diagrams (ZDDs) for sequences. SuffixDD is a c...
Notes on Sequence Binary Decision Diagrams: Relationship to Acyclic Automata and Complexities of Binary Set Operations
Manipulation of large sequence data is one of the most important problems in string processing. Recently, Loekito et al. (Knowl. Inf. Syst., 24(2), 235-268, 2009) have introduced a new data structure, called Sequence Binary Decision Diagrams (SeqBDDs, or SDDs), which are descendants of both acyclic DFAs (ADFAs) and binary decision diagrams (BDDs). SDDs can compactly represent sets of sequences ...
Graphillion: ZDD-based Software Library for Very Large Sets of Graphs
Graphillion is a library for manipulating very large sets of graphs, based on zero-suppressed binary decision diagrams (ZDDs) with advanced graph enumeration algorithms. Graphillion is implemented as a Python extension in C++, to encourage easy development of its applications without introducing significant performance overhead. Experimental results show that Graphillion allows us to manage an ...